JBIG2 Supported by OCR
Abstract
Digital mathematical libraries contain a large volume of PDF documents with scanned text. In this paper we describe how these documents can be compressed and thus delivered to users more efficiently. We introduce the JBIG2 standard for compressing bitonal images such as scanned text, and we discuss the issues that arise when OCR is used to improve the compression ratio of the open-source jbig2enc encoder. For this purpose we have designed an API for using OCR in jbig2enc, which we describe in this paper together with the results achieved so far.
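As a rough illustration of the idea in the abstract, the sketch below shows one plausible way OCR output could guide symbol matching in a JBIG2-style symbol dictionary: two glyphs are treated as the same symbol only if OCR assigns them the same character and their bitmaps differ in at most a few pixels. This is not the jbig2enc API or the authors' method; the function names, the `(ocr_label, bitmap)` input format, and the `max_diff` threshold are all assumptions made for the example.

```python
from collections import defaultdict


def same_shape(a, b):
    """True if two non-empty bitmaps have identical height and width."""
    return len(a) == len(b) and len(a[0]) == len(b[0])


def hamming(a, b):
    """Count differing pixels between two equally sized bitonal bitmaps."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def build_dictionary(glyphs, max_diff=1):
    """Group glyphs into representative symbols (hypothetical helper).

    glyphs: list of (ocr_label, bitmap) pairs, where bitmap is a tuple of
    rows and each row is a tuple of 0/1 pixels.
    Returns (representatives, indices): the list of kept (label, bitmap)
    symbols and, for each input glyph, the index of its representative.
    """
    representatives = []
    by_label = defaultdict(list)  # OCR label -> indices into representatives
    indices = []
    for label, bitmap in glyphs:
        match = None
        # Only compare against representatives that OCR gave the same label.
        for idx in by_label[label]:
            rep = representatives[idx][1]
            if same_shape(rep, bitmap) and hamming(rep, bitmap) <= max_diff:
                match = idx
                break
        if match is None:
            match = len(representatives)
            representatives.append((label, bitmap))
            by_label[label].append(match)
        indices.append(match)
    return representatives, indices
```

In this sketch the OCR label acts as a gate: glyphs that look alike but were recognized as different characters are never merged, while glyphs with the same label are merged when they are close enough pixel-wise.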
Similar resources
Information Extraction from Symbolically Compressed Document Images
The extraction of information from symbolically compressed document images is an increasingly important problem as the related standard (JBIG2) and commercial products become available. Symbolic compression techniques work by clustering individual connected components (blobs) in a document image and storing the sequence of occurrence of blobs and representative blob templates, hence t...
Storage of multi-component digital maps using JBIG2 image compression standard
Digital maps can be stored and distributed electronically using compressed raster image formats. Spatial access must be implemented to enable the user to operate directly on the compressed data without retrieving the entire image. In this work, we study how the latest JBIG2 standard can be used for storing the map images. The image tiling can be supported quite easily by storing each image bloc...
Dictionary design for text image compression with JBIG2
The JBIG2 standard for lossy and lossless bi-level image coding is a very flexible encoding strategy based on pattern matching techniques. This paper addresses the problem of compressing text images with JBIG2. For text image compression, JBIG2 allows two encoding strategies: SPM and PM&S. We compare in detail the lossless and lossy coding performance using the SPM-based and PM&S-based JBIG2, i...
Lossy Compression of Stochastic Halftones with JBIG2
The JBIG2 standard supports lossless and lossy coding models for text, halftone, and generic regions in bi-level images. For the JBIG2 lossy halftone compression mode, halftones are descreened before encoding. Previous JBIG2 descreening implementations produce high-quality images for clustered dot halftones at high compression rates but significantly degrade the image quality for stochastic half...
New Public-Key Authentication Watermarking for JBIG2 Resistant to Parity Attacks
An authentication watermark is hidden data inserted into an image that allows detecting any alteration made to the image. AWTs (Authentication Watermarking Techniques) normally make use of a secret- or public-key cryptographic cipher to compute the authentication signature of the image, and insert it into the image itself. Many previous public-key AWTs for uncompressed binary images can be attac...
Publication date: 2012